Journal of KIISE

Title: Training Data Augmentation Technique for Machine Comprehension by Question-Answer Pairs Generation Models based on a Pretrained Encoder-Decoder Model
Author: Hyeonho Shin, Sung-Pil Choi
Citation: Vol. 49, No. 2, pp. 166-175 (2022. 02)
Abstract
Machine reading comprehension (MRC) is the task of finding the answer to a question in a document. MRC research requires large-scale, high-quality data, which individual researchers and small research institutes have limited capacity to build. To address this limitation, this paper proposes an MRC data augmentation technique that uses pretrained language models. The technique consists of a question-answer pair generation model and a data validation model. The question-answer pair generation model consists of an answer extraction model and a question generation model, both built by fine-tuning the BART model. The data validation model is added separately to increase the reliability of the augmented data; it determines whether the generated augmented data is used. The validation model is an ELECTRA model fine-tuned as an MRC model. To confirm that the technique improves MRC model performance, it was applied to the KorQuAD v1.0 data. In the experiments, compared with the existing model, the Exact Match (EM) score increased by up to 7.2 and the F1 score increased by up to 5.7.
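The pipeline described in the abstract (answer extraction, then question generation, then validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The three callables stand in for the fine-tuned BART and ELECTRA models. The acceptance rule (keep a pair only when the validator's predicted answer agrees with the extracted answer), the `min_f1` threshold, and the whitespace-token F1 are all assumptions, since the abstract does not state how the validation model decides.

```python
from collections import Counter
from typing import Callable, Iterable, List, NamedTuple


class QAPair(NamedTuple):
    context: str
    question: str
    answer: str


def token_f1(prediction: str, reference: str) -> float:
    """Whitespace-token F1 between two answers (a simplification for this sketch)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def augment(
    contexts: Iterable[str],
    extract_answers: Callable[[str], List[str]],    # hypothetical: answer extraction model
    generate_question: Callable[[str, str], str],   # hypothetical: question generation model
    validate_answer: Callable[[str, str], str],     # hypothetical: MRC validation model
    min_f1: float = 1.0,                            # assumed threshold; 1.0 means exact token agreement
) -> List[QAPair]:
    """Generate question-answer pairs and keep only those the validator reproduces."""
    kept = []
    for context in contexts:
        for answer in extract_answers(context):
            question = generate_question(context, answer)
            predicted = validate_answer(context, question)
            # Assumed filter: the validator must recover the extracted answer.
            if token_f1(predicted, answer) >= min_f1:
                kept.append(QAPair(context, question, answer))
    return kept


# Toy stand-ins so the sketch runs without any trained model.
toy_context = "BART is fine-tuned for answer extraction and question generation."
toy_pairs = augment(
    [toy_context],
    extract_answers=lambda c: ["BART", "ELECTRA"],
    generate_question=lambda c, a: f"Which model does the text mention: {a}?",
    validate_answer=lambda c, q: "BART",
)
print(toy_pairs)
```

With these stand-ins, the candidate whose answer the validator reproduces is kept and the other is discarded. In practice the three callables would wrap model inference, and a threshold below 1.0 would admit partially matching answers.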
Keyword: data augmentation, machine reading comprehension, question generation, natural language processing, answer extraction